<!DOCTYPE html>

<html>

<head>

<meta charset="utf-8" />
<meta name="generator" content="pandoc" />
<meta http-equiv="X-UA-Compatible" content="IE=EDGE" />

<meta name="viewport" content="width=device-width, initial-scale=1" />

<meta name="author" content="Stéphane Dray" />

<meta name="date" content="2023-10-24" />

<title>Guerry data: Spatial Multivariate Analysis</title>

<script>// Pandoc 2.9 adds attributes on both header and div. We remove the former (to
// be compatible with the behavior of Pandoc < 2.8).
document.addEventListener('DOMContentLoaded', function(e) {
  var hs = document.querySelectorAll("div.section[class*='level'] > :first-child");
  var i, h, a;
  for (i = 0; i < hs.length; i++) {
    h = hs[i];
    if (!/^h[1-6]$/i.test(h.tagName)) continue;  // it should be a header h1-h6
    a = h.attributes;
    while (a.length > 0) h.removeAttribute(a[0].name);
  }
});
</script>

<style type="text/css">
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
span.underline{text-decoration: underline;}
div.column{display: inline-block; vertical-align: top; width: 50%;}
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
ul.task-list{list-style: none;}
</style>



<style type="text/css">
code {
white-space: pre;
}
.sourceCode {
overflow: visible;
}
</style>
<style type="text/css" data-origin="pandoc">
pre > code.sourceCode { white-space: pre; position: relative; }
pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
pre > code.sourceCode > span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode > span { color: inherit; text-decoration: inherit; }
div.sourceCode { margin: 1em 0; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
pre > code.sourceCode { white-space: pre-wrap; }
pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
pre.numberSource code > span
{ position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code > span > a:first-child::before
{ content: counter(source-line);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
color: #aaaaaa;
}
pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
code span.al { color: #ff0000; font-weight: bold; } 
code span.an { color: #60a0b0; font-weight: bold; font-style: italic; } 
code span.at { color: #7d9029; } 
code span.bn { color: #40a070; } 
code span.bu { color: #008000; } 
code span.cf { color: #007020; font-weight: bold; } 
code span.ch { color: #4070a0; } 
code span.cn { color: #880000; } 
code span.co { color: #60a0b0; font-style: italic; } 
code span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } 
code span.do { color: #ba2121; font-style: italic; } 
code span.dt { color: #902000; } 
code span.dv { color: #40a070; } 
code span.er { color: #ff0000; font-weight: bold; } 
code span.ex { } 
code span.fl { color: #40a070; } 
code span.fu { color: #06287e; } 
code span.im { color: #008000; font-weight: bold; } 
code span.in { color: #60a0b0; font-weight: bold; font-style: italic; } 
code span.kw { color: #007020; font-weight: bold; } 
code span.op { color: #666666; } 
code span.ot { color: #007020; } 
code span.pp { color: #bc7a00; } 
code span.sc { color: #4070a0; } 
code span.ss { color: #bb6688; } 
code span.st { color: #4070a0; } 
code span.va { color: #19177c; } 
code span.vs { color: #4070a0; } 
code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } 
</style>
<script>
// apply pandoc div.sourceCode style to pre.sourceCode instead
(function() {
  var sheets = document.styleSheets;
  for (var i = 0; i < sheets.length; i++) {
    if (sheets[i].ownerNode.dataset["origin"] !== "pandoc") continue;
    try { var rules = sheets[i].cssRules; } catch (e) { continue; }
    var j = 0;
    while (j < rules.length) {
      var rule = rules[j];
      // check if there is a div.sourceCode rule
      if (rule.type !== rule.STYLE_RULE || rule.selectorText !== "div.sourceCode") {
        j++;
        continue;
      }
      var style = rule.style.cssText;
      // check if color or background-color is set
      if (rule.style.color === '' && rule.style.backgroundColor === '') {
        j++;
        continue;
      }
      // replace div.sourceCode by a pre.sourceCode rule
      sheets[i].deleteRule(j);
      sheets[i].insertRule('pre.sourceCode{' + style + '}', j);
    }
  }
})();
</script>



<style type="text/css">

div.csl-bib-body { }
div.csl-entry {
clear: both;
}
.hanging div.csl-entry {
margin-left:2em;
text-indent:-2em;
}
div.csl-left-margin {
min-width:2em;
float:left;
}
div.csl-right-inline {
margin-left:2em;
padding-left:1em;
}
div.csl-indent {
margin-left: 2em;
}
</style>

<style type="text/css">body {
background-color: #fff;
margin: 1em auto;
max-width: 700px;
overflow: visible;
padding-left: 2em;
padding-right: 2em;
font-family: "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;
font-size: 14px;
line-height: 1.35;
}
#TOC {
clear: both;
margin: 0 0 10px 10px;
padding: 4px;
width: 400px;
border: 1px solid #CCCCCC;
border-radius: 5px;
background-color: #f6f6f6;
font-size: 13px;
line-height: 1.3;
}
#TOC .toctitle {
font-weight: bold;
font-size: 15px;
margin-left: 5px;
}
#TOC ul {
padding-left: 40px;
margin-left: -1.5em;
margin-top: 5px;
margin-bottom: 5px;
}
#TOC ul ul {
margin-left: -2em;
}
#TOC li {
line-height: 16px;
}
table {
margin: 1em auto;
border-width: 1px;
border-color: #DDDDDD;
border-style: outset;
border-collapse: collapse;
}
table th {
border-width: 2px;
padding: 5px;
border-style: inset;
}
table td {
border-width: 1px;
border-style: inset;
line-height: 18px;
padding: 5px 5px;
}
table, table th, table td {
border-left-style: none;
border-right-style: none;
}
table thead, table tr.even {
background-color: #f7f7f7;
}
p {
margin: 0.5em 0;
}
blockquote {
background-color: #f6f6f6;
padding: 0.25em 0.75em;
}
hr {
border-style: solid;
border: none;
border-top: 1px solid #777;
margin: 28px 0;
}
dl {
margin-left: 0;
}
dl dd {
margin-bottom: 13px;
margin-left: 13px;
}
dl dt {
font-weight: bold;
}
ul {
margin-top: 0;
}
ul li {
list-style: circle outside;
}
ul ul {
margin-bottom: 0;
}
pre, code {
background-color: #f7f7f7;
border-radius: 3px;
color: #333;
white-space: pre-wrap; 
}
pre {
border-radius: 3px;
margin: 5px 0px 10px 0px;
padding: 10px;
}
pre:not([class]) {
background-color: #f7f7f7;
}
code {
font-family: Consolas, Monaco, 'Courier New', monospace;
font-size: 85%;
}
p > code, li > code {
padding: 2px 0px;
}
div.figure {
text-align: center;
}
img {
background-color: #FFFFFF;
padding: 2px;
border: 1px solid #DDDDDD;
border-radius: 3px;
border: 1px solid #CCCCCC;
margin: 0 5px;
}
h1 {
margin-top: 0;
font-size: 35px;
line-height: 40px;
}
h2 {
border-bottom: 4px solid #f7f7f7;
padding-top: 10px;
padding-bottom: 2px;
font-size: 145%;
}
h3 {
border-bottom: 2px solid #f7f7f7;
padding-top: 10px;
font-size: 120%;
}
h4 {
border-bottom: 1px solid #f7f7f7;
margin-left: 8px;
font-size: 105%;
}
h5, h6 {
border-bottom: 1px solid #ccc;
font-size: 105%;
}
a {
color: #0033dd;
text-decoration: none;
}
a:hover {
color: #6666ff; }
a:visited {
color: #800080; }
a:visited:hover {
color: #BB00BB; }
a[href^="http:"] {
text-decoration: underline; }
a[href^="https:"] {
text-decoration: underline; }

code > span.kw { color: #555; font-weight: bold; } 
code > span.dt { color: #902000; } 
code > span.dv { color: #40a070; } 
code > span.bn { color: #d14; } 
code > span.fl { color: #d14; } 
code > span.ch { color: #d14; } 
code > span.st { color: #d14; } 
code > span.co { color: #888888; font-style: italic; } 
code > span.ot { color: #007020; } 
code > span.al { color: #ff0000; font-weight: bold; } 
code > span.fu { color: #900; font-weight: bold; } 
code > span.er { color: #a61717; background-color: #e3d2d2; } 
</style>




</head>

<body>




<h1 class="title toc-ignore">Guerry data: Spatial Multivariate
Analysis</h1>
<h4 class="author">Stéphane Dray</h4>
<h4 class="date">2023-10-24</h4>


<div id="TOC">
<ul>
<li><a href="#preliminary-steps" id="toc-preliminary-steps">Preliminary
steps</a></li>
<li><a href="#standard-approaches" id="toc-standard-approaches">Standard
approaches</a>
<ul>
<li><a href="#multivariate-analysis" id="toc-multivariate-analysis">Multivariate analysis</a></li>
<li><a href="#spatial-autocorrelation" id="toc-spatial-autocorrelation">Spatial autocorrelation</a></li>
<li><a href="#indirect-integration-of-multivariate-and-geographical-aspects" id="toc-indirect-integration-of-multivariate-and-geographical-aspects">Indirect
integration of multivariate and geographical aspects</a></li>
</ul></li>
<li><a href="#spatial-multivariate-analysis" id="toc-spatial-multivariate-analysis">Spatial multivariate analysis</a>
<ul>
<li><a href="#spatial-partition" id="toc-spatial-partition">Spatial
partition</a></li>
<li><a href="#spatial-explanatory-variables" id="toc-spatial-explanatory-variables">Spatial explanatory
variables</a></li>
<li><a href="#spatial-graph-and-weighting-matrix" id="toc-spatial-graph-and-weighting-matrix">Spatial graph and weighting
matrix</a></li>
</ul></li>
<li><a href="#conclusions" id="toc-conclusions">Conclusions</a></li>
<li><a href="#references" id="toc-references">References</a></li>
</ul>
</div>

<p>This vignette indicates how to perform the analyses described in
<span class="citation">Dray and Jombart (<a href="#ref-SD966">2011</a>)</span> of data derived from André-Michel
Guerry’s <span class="citation">(<a href="#ref-SD955">1833</a>)</span>
<em>Essai sur la Statistique Morale de la France</em>. It illustrates
some classical methods for analysis of multivariate spatial data that
focus <em>either</em> on the multivariate aspect or on the spatial one,
as well as some more modern methods that attempt to integrate
geographical and multivariate aspects <em>simultaneously</em>.</p>
<div id="preliminary-steps" class="section level1">
<h1>Preliminary steps</h1>
<p>Several packages are required to run the different analyses and
should be loaded. The <code>ade4</code> package for multivariate
analysis is supplemented by <code>adegraphics</code> (for associated
graphical methods) and <code>adespatial</code> for multivariate spatial
analysis. For further information on this spatial analysis approach, see
<code>vignette(package=&quot;adespatial&quot;, &quot;tutorial&quot;)</code>.</p>
<div class="sourceCode" id="cb1"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb1-1"><a href="#cb1-1" tabindex="-1"></a><span class="fu">library</span>(Guerry)       <span class="co"># Guerry&#39;s data</span></span>
<span id="cb1-2"><a href="#cb1-2" tabindex="-1"></a><span class="fu">library</span>(sp)           <span class="co"># management of spatial data</span></span>
<span id="cb1-3"><a href="#cb1-3" tabindex="-1"></a><span class="fu">library</span>(ade4)         <span class="co"># multivariate analysis</span></span>
<span id="cb1-4"><a href="#cb1-4" tabindex="-1"></a><span class="fu">library</span>(adegraphics)  <span class="co"># graphical representation</span></span>
<span id="cb1-5"><a href="#cb1-5" tabindex="-1"></a><span class="fu">library</span>(spdep)        <span class="co"># spatial dependency</span></span>
<span id="cb1-6"><a href="#cb1-6" tabindex="-1"></a><span class="fu">library</span>(adespatial)   <span class="co"># multivariate spatial analysis</span></span></code></pre></div>
<p>Guerry gathered data on crimes, suicide, literacy and other moral
statistics for various départements (i.e., counties) in France. He
provided the first real social data analysis, using graphics and maps to
summarize this georeferenced multivariate dataset. We use the dataset
<code>gfrance85</code> and consider six key quantitative variables
(shown in the table below) for each of the 85 départements of France in
1830 (Corsica, an island and often an outlier, was excluded).</p>
<p>Data are recorded on aligned scales so that <strong>larger
numbers</strong> consistently reflect “morally better”. Thus, four of
the “bad” variables are recorded in the inverse form, as “Population per
…”. With this scaling, it would be expected that all correlations be
<span class="math inline">\(\ge 0\)</span>.</p>
<table>
<thead>
<tr class="header">
<th align="left">Name</th>
<th align="left">Description</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td align="left"><code>Crime_pers</code></td>
<td align="left">Population per crime against persons</td>
</tr>
<tr class="even">
<td align="left"><code>Crime_prop</code></td>
<td align="left">Population per crime against property</td>
</tr>
<tr class="odd">
<td align="left"><code>Literacy</code></td>
<td align="left">Percent of military conscripts who can read and
write</td>
</tr>
<tr class="even">
<td align="left"><code>Donations</code></td>
<td align="left">Donations to the poor</td>
</tr>
<tr class="odd">
<td align="left"><code>Infants</code></td>
<td align="left">Population per illegitimate birth</td>
</tr>
<tr class="even">
<td align="left"><code>Suicides</code></td>
<td align="left">Population per suicide</td>
</tr>
</tbody>
</table>
<p>The dataset <code>gfrance85</code> is actually a
<code>SpatialPolygonsDataFrame</code> object created with the
<code>sp</code> package. It contains the polygon boundaries of the map
of France in 1830, as well as the variables in the <code>Guerry</code> data
frame. Viewed as an ordinary data frame, it has the following components:</p>
<div class="sourceCode" id="cb2"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb2-1"><a href="#cb2-1" tabindex="-1"></a><span class="fu">names</span>(gfrance85)</span>
<span id="cb2-2"><a href="#cb2-2" tabindex="-1"></a><span class="co">#&gt;  [1] &quot;CODE_DEPT&quot;       &quot;COUNT&quot;           &quot;AVE_ID_GEO&quot;      &quot;dept&quot;           </span></span>
<span id="cb2-3"><a href="#cb2-3" tabindex="-1"></a><span class="co">#&gt;  [5] &quot;Region&quot;          &quot;Department&quot;      &quot;Crime_pers&quot;      &quot;Crime_prop&quot;     </span></span>
<span id="cb2-4"><a href="#cb2-4" tabindex="-1"></a><span class="co">#&gt;  [9] &quot;Literacy&quot;        &quot;Donations&quot;       &quot;Infants&quot;         &quot;Suicides&quot;       </span></span>
<span id="cb2-5"><a href="#cb2-5" tabindex="-1"></a><span class="co">#&gt; [13] &quot;MainCity&quot;        &quot;Wealth&quot;          &quot;Commerce&quot;        &quot;Clergy&quot;         </span></span>
<span id="cb2-6"><a href="#cb2-6" tabindex="-1"></a><span class="co">#&gt; [17] &quot;Crime_parents&quot;   &quot;Infanticide&quot;     &quot;Donation_clergy&quot; &quot;Lottery&quot;        </span></span>
<span id="cb2-7"><a href="#cb2-7" tabindex="-1"></a><span class="co">#&gt; [21] &quot;Desertion&quot;       &quot;Instruction&quot;     &quot;Prostitutes&quot;     &quot;Distance&quot;       </span></span>
<span id="cb2-8"><a href="#cb2-8" tabindex="-1"></a><span class="co">#&gt; [25] &quot;Area&quot;            &quot;Pop1831&quot;</span></span></code></pre></div>
<p>To simplify analyses, we extract the components to be used below.</p>
<div class="sourceCode" id="cb3"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb3-1"><a href="#cb3-1" tabindex="-1"></a> <span class="fu">data</span>(gfrance85)</span>
<span id="cb3-2"><a href="#cb3-2" tabindex="-1"></a> df           <span class="ot">&lt;-</span> <span class="fu">data.frame</span>(gfrance85)[, <span class="dv">7</span><span class="sc">:</span><span class="dv">12</span>]    <span class="co"># the 6 variables</span></span>
<span id="cb3-3"><a href="#cb3-3" tabindex="-1"></a> france.map   <span class="ot">&lt;-</span> <span class="fu">as</span>(gfrance85, <span class="st">&quot;SpatialPolygons&quot;</span>) <span class="co"># the map</span></span>
<span id="cb3-4"><a href="#cb3-4" tabindex="-1"></a> xy           <span class="ot">&lt;-</span> <span class="fu">coordinates</span>(gfrance85)           <span class="co"># spatial coordinates</span></span>
<span id="cb3-5"><a href="#cb3-5" tabindex="-1"></a> dep.names    <span class="ot">&lt;-</span> <span class="fu">data.frame</span>(gfrance85)[, <span class="dv">6</span>]       <span class="co"># departement names</span></span>
<span id="cb3-6"><a href="#cb3-6" tabindex="-1"></a> region.names <span class="ot">&lt;-</span> <span class="fu">data.frame</span>(gfrance85)[, <span class="dv">5</span>]       <span class="co"># region names</span></span>
<span id="cb3-7"><a href="#cb3-7" tabindex="-1"></a> col.region   <span class="ot">&lt;-</span> <span class="fu">colors</span>()[<span class="fu">c</span>(<span class="dv">149</span>, <span class="dv">254</span>, <span class="dv">468</span>, <span class="dv">552</span>, <span class="dv">26</span>)] <span class="co"># colors for region</span></span></code></pre></div>
</div>
<div id="standard-approaches" class="section level1">
<h1>Standard approaches</h1>
<p>In this section, we focus on classical approaches that consider
either the multivariate or the spatial aspect of the data.</p>
<div id="multivariate-analysis" class="section level2">
<h2>Multivariate analysis</h2>
<p>Here we consider <span class="math inline">\(p=6\)</span> variables
measured for <span class="math inline">\(n=85\)</span> individuals
(départements of France). As only quantitative variables have been
recorded, principal component analysis <span class="citation">(PCA, <a href="#ref-SD308">Hotelling 1933</a>)</span> is well adapted. PCA
summarizes the data by maximizing simultaneously the variance of the
projection of the individuals onto the principal axes and the sum of the
squared correlations between each principal component and the
variables.</p>
<div class="sourceCode" id="cb4"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb4-1"><a href="#cb4-1" tabindex="-1"></a>pca <span class="ot">&lt;-</span> <span class="fu">dudi.pca</span>(df, <span class="at">scannf =</span> <span class="cn">FALSE</span>, <span class="at">nf =</span> <span class="dv">3</span>)</span></code></pre></div>
<p>The biplot is simply obtained by</p>
<div class="sourceCode" id="cb5"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb5-1"><a href="#cb5-1" tabindex="-1"></a><span class="fu">biplot</span>(pca, <span class="at">plabel.cex =</span> <span class="fl">0.8</span>)</span></code></pre></div>
<p>The first two PCA dimensions account for 35.7% and 20.0%,
respectively, of the total variance.</p>
<div class="sourceCode" id="cb6"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb6-1"><a href="#cb6-1" tabindex="-1"></a>pca<span class="sc">$</span>eig<span class="sc">/</span><span class="fu">sum</span>(pca<span class="sc">$</span>eig) <span class="sc">*</span> <span class="dv">100</span></span>
<span id="cb6-2"><a href="#cb6-2" tabindex="-1"></a><span class="co">#&gt; [1] 35.674507 20.013665 18.367448 11.116102  9.144584  5.683694</span></span></code></pre></div>
<p>Correlations between variables and principal components can be
represented on a correlation circle. The first axis is negatively
correlated to literacy and positively correlated to property crime,
suicides and illegitimate births. The second axis is aligned mainly with
personal crime and donations to the poor.</p>
<div class="sourceCode" id="cb7"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb7-1"><a href="#cb7-1" tabindex="-1"></a><span class="fu">s.corcircle</span>(pca<span class="sc">$</span>co)</span></code></pre></div>
<p>Spatial information can be added to the factorial map of the
départements’ scores on the principal axes by coloring each point
according to the region of France it belongs to.</p>
<div class="sourceCode" id="cb8"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb8-1"><a href="#cb8-1" tabindex="-1"></a><span class="fu">s.label</span>(pca<span class="sc">$</span>li, <span class="at">ppoint.col =</span> col.region[region.names], <span class="at">plabel.optim =</span> <span class="cn">TRUE</span>, <span class="at">plabel.cex =</span> <span class="fl">0.6</span>)</span></code></pre></div>
<div class="sourceCode" id="cb9"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb9-1"><a href="#cb9-1" tabindex="-1"></a><span class="fu">s.Spatial</span>(france.map, <span class="at">col =</span> col.region[region.names], <span class="at">plabel.cex =</span> <span class="dv">0</span>)</span>
<span id="cb9-2"><a href="#cb9-2" tabindex="-1"></a><span class="fu">s.class</span>(xy, region.names, <span class="at">col =</span> col.region, <span class="at">add =</span> <span class="cn">TRUE</span>, <span class="at">ellipseSize =</span> <span class="dv">0</span>, <span class="at">starSize =</span> <span class="dv">0</span>)</span></code></pre></div>
<p>For the first axis, the North and East are characterized by negative
scores, corresponding to high levels of literacy and high rates of
suicides, crimes against property and illegitimate births. The second
axis mainly contrasts the West (high donations to the poor and low
levels of crime against persons) with the South.</p>
</div>
<div id="spatial-autocorrelation" class="section level2">
<h2>Spatial autocorrelation</h2>
<p>Spatial autocorrelation statistics, such as <span class="citation">Moran (<a href="#ref-SD455">1948</a>)</span>
Coefficient (MC) and <span class="citation">Geary (<a href="#ref-SD223">1954</a>)</span> Ratio, aim to measure and analyze the
degree of dependency among observations in a geographical context <span class="citation">(<a href="#ref-SD577">Cliff and Ord
1973</a>)</span>.</p>
<div id="the-spatial-weighting-matrix" class="section level3">
<h3>The spatial weighting matrix</h3>
<p>The first step of spatial autocorrelation analysis is to define a
spatial weighting matrix <span class="math inline">\(\mathbf{W}=[w_{ij}]\)</span>. In the case of
Guerry’s data, we simply defined a binary neighborhood in which two
départements are considered neighbors if they share a common border.
The spatial weighting matrix is then obtained after row-standardization
(<code>style = &quot;W&quot;</code>):</p>
<div class="sourceCode" id="cb10"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb10-1"><a href="#cb10-1" tabindex="-1"></a>nb <span class="ot">&lt;-</span> <span class="fu">poly2nb</span>(gfrance85)</span>
<span id="cb10-2"><a href="#cb10-2" tabindex="-1"></a>lw <span class="ot">&lt;-</span> <span class="fu">nb2listw</span>(nb, <span class="at">style =</span> <span class="st">&quot;W&quot;</span>)</span></code></pre></div>
<p>We can represent this neighborhood on the geographical map:</p>
<div class="sourceCode" id="cb11"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb11-1"><a href="#cb11-1" tabindex="-1"></a><span class="fu">s.Spatial</span>(france.map, <span class="at">nb =</span> nb, <span class="at">plabel.cex =</span> <span class="dv">0</span>, <span class="at">pSp.border =</span> <span class="st">&quot;white&quot;</span>)</span></code></pre></div>
</div>
<div id="morans-coefficient" class="section level3">
<h3>Moran’s Coefficient</h3>
<p>Once the spatial weights have been defined, the spatial
autocorrelation statistics can then be computed. Let us consider the
<span class="math inline">\(n\)</span>-by-1 vector <span class="math inline">\(\mathbf{x}=\left[ {x_1 \cdots x_n }
\right]^\textrm{T}\)</span> containing measurements of a quantitative
variable for <span class="math inline">\(n\)</span> spatial units. The
usual formulation for Moran’s coefficient of spatial autocorrelation
<span class="citation">(<a href="#ref-SD577">Cliff and Ord 1973</a>; <a href="#ref-SD455">Moran 1948</a>)</span> is <span class="math display">\[\begin{equation}
\label{eq1}
MC(\mathbf{x})=\frac{n\sum\nolimits_{\left( 2 \right)} {w_{ij} (x_i
-\bar
{x})(x_j -\bar {x})} }{\sum\nolimits_{\left( 2 \right)} {w_{ij} }
\sum\nolimits_{i=1}^n {(x_i -\bar {x})^2} }\mbox{ where
}\sum\nolimits_{\left( 2 \right)} =\sum\limits_{i=1}^n
{\sum\limits_{j=1}^n
} \mbox{ with }i\ne j.
\end{equation}\]</span></p>
<p>MC can be rewritten using matrix notation: <span class="math display">\[\begin{equation}
\label{eq2}
MC(\mathbf{x})=\frac{n}{\mathbf{1}^\textrm{T}\mathbf{W1}}\frac{\mathbf{z}^\textrm{T}{\mathbf{Wz}}}{\mathbf{z}^\textrm{T}\mathbf{z}},
\end{equation}\]</span> where <span class="math inline">\(\mathbf{z}=\left (
\mathbf{I}_n-\mathbf{1}_n\mathbf{1}_n^\textrm{T} /n \right
)\mathbf{x}\)</span> is the vector of centered values (<span class="math inline">\(z_i=x_i-\bar{x}\)</span>) and <span class="math inline">\(\mathbf{1}_n\)</span> is a vector of ones (of
length <span class="math inline">\(n\)</span>).</p>
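<p>The matrix form of MC can be computed directly. The following sketch
is illustrative only: it assumes the <code>lw</code> and <code>df</code>
objects created above and uses <code>listw2mat()</code> (from
<code>spdep</code>) to convert the <code>listw</code> object into an
ordinary matrix. Its result should match the observed statistic returned
by <code>moran.randtest()</code> for the same variable.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">## Illustrative sketch: MC from the matrix formulation
## (assumes the lw and df objects created above)
W &lt;- listw2mat(lw)               # spatial weights as an ordinary matrix
z &lt;- df[, 3] - mean(df[, 3])     # centered values of Literacy
n &lt;- length(z)
(n / sum(W)) * drop(t(z) %*% W %*% z) / drop(t(z) %*% z)</code></pre></div>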
<p>The significance of the observed value of MC can be tested by a
Monte-Carlo procedure, in which locations are permuted to obtain a
distribution of MC under the null hypothesis of random distribution. An
observed value of MC that is greater than that expected at random
indicates the clustering of similar values across space (positive
spatial autocorrelation), while a significant negative value of MC
indicates that neighboring values are more dissimilar than expected by
chance (negative spatial autocorrelation).</p>
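<p>The logic of this Monte-Carlo procedure can be sketched in a few
lines (illustrative only; it assumes the <code>lw</code> and
<code>df</code> objects created above, and uses the matrix form of MC
with row-standardized weights, for which <span class="math inline">\(\mathbf{1}^\textrm{T}\mathbf{W1}=n\)</span>). The
<code>moran.randtest()</code> function used below implements this test
properly.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">## Illustrative sketch of the permutation test for MC
## (assumes the lw and df objects created above)
mc  &lt;- function(z, W) drop(t(z) %*% W %*% z) / drop(t(z) %*% z)
W   &lt;- listw2mat(lw)               # row-standardized weights as a matrix
z   &lt;- df[, 3] - mean(df[, 3])     # centered values of Literacy
obs &lt;- mc(z, W)                    # observed MC for Literacy
sim &lt;- replicate(999, mc(sample(z), W))  # MC after permuting locations
(sum(sim &gt;= obs) + 1) / (999 + 1)        # one-sided p-value</code></pre></div>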
<p>We computed MC for Guerry’s dataset. A positive and significant
autocorrelation is identified for each of the six variables. Literacy
values are the most similar among adjacent départements, while
illegitimate births (Infants) covary least.</p>
<div class="sourceCode" id="cb12"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb12-1"><a href="#cb12-1" tabindex="-1"></a><span class="fu">moran.randtest</span>(df, lw)</span>
<span id="cb12-2"><a href="#cb12-2" tabindex="-1"></a><span class="co">#&gt; class: krandtest lightkrandtest </span></span>
<span id="cb12-3"><a href="#cb12-3" tabindex="-1"></a><span class="co">#&gt; Monte-Carlo tests</span></span>
<span id="cb12-4"><a href="#cb12-4" tabindex="-1"></a><span class="co">#&gt; Call: moran.randtest(x = df, listw = lw)</span></span>
<span id="cb12-5"><a href="#cb12-5" tabindex="-1"></a><span class="co">#&gt; </span></span>
<span id="cb12-6"><a href="#cb12-6" tabindex="-1"></a><span class="co">#&gt; Number of tests:   6 </span></span>
<span id="cb12-7"><a href="#cb12-7" tabindex="-1"></a><span class="co">#&gt; </span></span>
<span id="cb12-8"><a href="#cb12-8" tabindex="-1"></a><span class="co">#&gt; Adjustment method for multiple comparisons:   none </span></span>
<span id="cb12-9"><a href="#cb12-9" tabindex="-1"></a><span class="co">#&gt; Permutation number:   999 </span></span>
<span id="cb12-10"><a href="#cb12-10" tabindex="-1"></a><span class="co">#&gt;         Test       Obs   Std.Obs   Alter Pvalue</span></span>
<span id="cb12-11"><a href="#cb12-11" tabindex="-1"></a><span class="co">#&gt; 1 Crime_pers 0.4114597  5.797137 greater  0.001</span></span>
<span id="cb12-12"><a href="#cb12-12" tabindex="-1"></a><span class="co">#&gt; 2 Crime_prop 0.2635533  4.160819 greater  0.002</span></span>
<span id="cb12-13"><a href="#cb12-13" tabindex="-1"></a><span class="co">#&gt; 3   Literacy 0.7176053 10.357515 greater  0.001</span></span>
<span id="cb12-14"><a href="#cb12-14" tabindex="-1"></a><span class="co">#&gt; 4  Donations 0.3533613  5.204476 greater  0.001</span></span>
<span id="cb12-15"><a href="#cb12-15" tabindex="-1"></a><span class="co">#&gt; 5    Infants 0.2287241  3.628118 greater  0.002</span></span>
<span id="cb12-16"><a href="#cb12-16" tabindex="-1"></a><span class="co">#&gt; 6   Suicides 0.4016812  6.081180 greater  0.001</span></span></code></pre></div>
</div>
<div id="moran-scatterplot" class="section level3">
<h3>Moran scatterplot</h3>
<p>If the spatial weighting matrix is row-standardized, we can define
the lag vector <span class="math inline">\(\mathbf{\tilde{z}} =
\mathbf{Wz}\)</span> (i.e., <span class="math inline">\(\tilde{z}_i =
\sum\limits_{j=1}^n{w_{ij}x_j}\)</span>) composed of the weighted (by
the spatial weighting matrix) averages of the neighboring values. Thus,
we have: <span class="math display">\[\begin{equation}
\label{eq3}
MC(\mathbf{x})=\frac{\mathbf{z}^\textrm{T}{\mathbf{\tilde{z}}}}{\mathbf{z}^\textrm{T}\mathbf{z}},
\end{equation}\]</span> since in this case <span class="math inline">\(\mathbf{1}^\textrm{T}\mathbf{W1}=n\)</span>. This
shows clearly that MC measures the autocorrelation by giving an
indication of the intensity of the linear association between the vector
of observed values <span class="math inline">\(\mathbf{z}\)</span> and
the vector of weighted averages of neighboring values <span class="math inline">\(\mathbf{\tilde{z}}\)</span>. <span class="citation">Anselin (<a href="#ref-SD566">1996</a>)</span> proposed
to visualize MC in the form of a bivariate scatterplot of <span class="math inline">\(\mathbf{\tilde{z}}\)</span> against <span class="math inline">\(\mathbf{z}\)</span>. A linear regression can be
added to this <em>Moran scatterplot</em>, with slope equal to MC.</p>
<p>Considering the Literacy variable of Guerry’s data, the Moran
scatterplot clearly shows strong autocorrelation. It also shows that the
Hautes-Alpes département has a slightly outlying position characterized
by a high value of Literacy compared to its neighbors.</p>
<div class="sourceCode" id="cb13"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb13-1"><a href="#cb13-1" tabindex="-1"></a>x <span class="ot">&lt;-</span> df[, <span class="dv">3</span>]</span>
<span id="cb13-2"><a href="#cb13-2" tabindex="-1"></a>x.lag <span class="ot">&lt;-</span> <span class="fu">lag.listw</span>(lw, df[, <span class="dv">3</span>])</span>
<span id="cb13-3"><a href="#cb13-3" tabindex="-1"></a><span class="fu">moran.plot</span>(x, lw)</span>
<span id="cb13-4"><a href="#cb13-4" tabindex="-1"></a><span class="fu">text</span>(x[<span class="dv">5</span>], x.lag[<span class="dv">5</span>], dep.names[<span class="dv">5</span>], <span class="at">pos =</span> <span class="dv">1</span>, <span class="at">cex =</span> <span class="fl">0.8</span>)</span></code></pre></div>
</div>
</div>
<div id="indirect-integration-of-multivariate-and-geographical-aspects" class="section level2">
<h2>Indirect integration of multivariate and geographical aspects</h2>
<p>The simplest approach is a two-step procedure in which the data
are first summarized with a multivariate analysis such as PCA. In a second
step, univariate spatial statistics or mapping techniques are applied to
PCA scores for each axis separately. One can also test for the presence
of spatial autocorrelation for the first few scores of the analysis,
with univariate autocorrelation statistics such as MC. We mapped scores
of the départements for the first two axes of the PCA of Guerry’s data.
Even if PCA maximizes only the variance of these scores, there is also a
clear spatial structure, as the scores are highly autocorrelated. The
map for the first axis corresponds closely to the split between <em>la
France éclairée</em> (North-East characterized by an higher level of
Literacy) and <em>la France obscure</em>.</p>
<div class="sourceCode" id="cb14"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb14-1"><a href="#cb14-1" tabindex="-1"></a><span class="fu">moran.randtest</span>(pca<span class="sc">$</span>li, lw)</span>
<span id="cb14-2"><a href="#cb14-2" tabindex="-1"></a><span class="co">#&gt; class: krandtest lightkrandtest </span></span>
<span id="cb14-3"><a href="#cb14-3" tabindex="-1"></a><span class="co">#&gt; Monte-Carlo tests</span></span>
<span id="cb14-4"><a href="#cb14-4" tabindex="-1"></a><span class="co">#&gt; Call: moran.randtest(x = pca$li, listw = lw)</span></span>
<span id="cb14-5"><a href="#cb14-5" tabindex="-1"></a><span class="co">#&gt; </span></span>
<span id="cb14-6"><a href="#cb14-6" tabindex="-1"></a><span class="co">#&gt; Number of tests:   3 </span></span>
<span id="cb14-7"><a href="#cb14-7" tabindex="-1"></a><span class="co">#&gt; </span></span>
<span id="cb14-8"><a href="#cb14-8" tabindex="-1"></a><span class="co">#&gt; Adjustment method for multiple comparisons:   none </span></span>
<span id="cb14-9"><a href="#cb14-9" tabindex="-1"></a><span class="co">#&gt; Permutation number:   999 </span></span>
<span id="cb14-10"><a href="#cb14-10" tabindex="-1"></a><span class="co">#&gt;    Test       Obs  Std.Obs   Alter Pvalue</span></span>
<span id="cb14-11"><a href="#cb14-11" tabindex="-1"></a><span class="co">#&gt; 1 Axis1 0.5506343 7.704990 greater  0.001</span></span>
<span id="cb14-12"><a href="#cb14-12" tabindex="-1"></a><span class="co">#&gt; 2 Axis2 0.5614030 8.237265 greater  0.001</span></span>
<span id="cb14-13"><a href="#cb14-13" tabindex="-1"></a><span class="co">#&gt; 3 Axis3 0.1805569 2.701054 greater  0.006</span></span>
<span id="cb14-14"><a href="#cb14-14" tabindex="-1"></a><span class="fu">s.value</span>(xy, pca<span class="sc">$</span>li[, <span class="dv">1</span><span class="sc">:</span><span class="dv">2</span>], <span class="at">Sp =</span> france.map, <span class="at">pSp.border =</span> <span class="st">&quot;white&quot;</span>, <span class="at">symbol =</span> <span class="st">&quot;circle&quot;</span>, <span class="at">pgrid.draw =</span> <span class="cn">FALSE</span>)</span></code></pre></div>
<p><img src="" style="display: block; margin: auto;" /></p>
</div>
</div>
<div id="spatial-multivariate-analysis" class="section level1">
<h1>Spatial multivariate analysis</h1>
<p>Over the last decades, several approaches have been developed to
consider both geographical and multivariate information simultaneously.
The multivariate aspect is usually treated by dimensionality-reduction
techniques similar to PCA, while several alternatives have been proposed
to integrate the spatial information.</p>
<div id="spatial-partition" class="section level2">
<h2>Spatial partition</h2>
<p>One alternative is to consider a spatial partition of the study area.
In this case, the spatial information is coded as a categorical
variable, and each category corresponds to a region of the whole study
area. For instance, Guerry’s data contained a partition of France into 5
regions.</p>
<p>We used between-class analysis <span class="citation">(BCA, <a href="#ref-SD148">Dolédec and Chessel 1987</a>)</span> to investigate
differences between regions. BCA maximizes the variance between
groups.</p>
<div class="sourceCode" id="cb15"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb15-1"><a href="#cb15-1" tabindex="-1"></a> bet <span class="ot">&lt;-</span> <span class="fu">bca</span>(pca, region.names, <span class="at">scannf =</span> <span class="cn">FALSE</span>, <span class="at">nf =</span> <span class="dv">2</span>)</span></code></pre></div>
<p>Here, 28.8 % of the total variance (sum of eigenvalues of PCA)
corresponds to the between-regions variance (sum of the eigenvalues of
BCA).</p>
<div class="sourceCode" id="cb16"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb16-1"><a href="#cb16-1" tabindex="-1"></a>bet<span class="sc">$</span>ratio</span>
<span id="cb16-2"><a href="#cb16-2" tabindex="-1"></a><span class="co">#&gt; [1] 0.288139</span></span></code></pre></div>
<p>The main graphical outputs are obtained by the generic
<code>plot</code> function:</p>
<div class="sourceCode" id="cb17"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb17-1"><a href="#cb17-1" tabindex="-1"></a><span class="fu">plot</span>(bet)</span></code></pre></div>
<p><img src="" style="display: block; margin: auto;" /></p>
<p>The barplot of eigenvalues indicates that two axes should be
interpreted. The first two BCA dimensions account for 59 % and 30.2 %,
respectively, of the between-regions variance.</p>
<div class="sourceCode" id="cb18"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb18-1"><a href="#cb18-1" tabindex="-1"></a> <span class="fu">barplot</span>(bet<span class="sc">$</span>eig)</span></code></pre></div>
<p><img src="" style="display: block; margin: auto;" /></p>
<div class="sourceCode" id="cb19"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb19-1"><a href="#cb19-1" tabindex="-1"></a> bet<span class="sc">$</span>eig<span class="sc">/</span><span class="fu">sum</span>(bet<span class="sc">$</span>eig) <span class="sc">*</span> <span class="dv">100</span></span>
<span id="cb19-2"><a href="#cb19-2" tabindex="-1"></a><span class="co">#&gt; [1] 58.995823 30.159802  7.417445  3.426930</span></span></code></pre></div>
<p>The coefficients used to construct the linear combinations of
variables are represented:</p>
<div class="sourceCode" id="cb20"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb20-1"><a href="#cb20-1" tabindex="-1"></a><span class="fu">s.arrow</span>(bet<span class="sc">$</span>c1, <span class="at">plabel.cex =</span> <span class="fl">0.8</span>)</span></code></pre></div>
<p><img src="" style="display: block; margin: auto;" /></p>
<p>The first axis opposes literacy to property crime, suicides and
illegitimate births. The second axis is mainly aligned with personal
crime and donations to the poor.</p>
<p>Projections of départements on the BCA axes can be represented on the
factorial map:</p>
<div class="sourceCode" id="cb21"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb21-1"><a href="#cb21-1" tabindex="-1"></a><span class="fu">s.label</span>(bet<span class="sc">$</span>ls, <span class="fu">as.character</span>(dep.names), <span class="at">ppoint.cex =</span> <span class="dv">0</span>, <span class="at">plabel.optim =</span> <span class="cn">TRUE</span>, <span class="at">plabel.col =</span> col.region[region.names], <span class="at">plabel.cex =</span> <span class="fl">0.5</span>)</span>
<span id="cb21-2"><a href="#cb21-2" tabindex="-1"></a><span class="fu">s.class</span>(bet<span class="sc">$</span>ls, <span class="at">fac =</span> region.names, <span class="at">col =</span> col.region, <span class="at">ellipse =</span> <span class="dv">0</span>, <span class="at">add =</span> <span class="cn">TRUE</span>)</span></code></pre></div>
<p><img src="" style="display: block; margin: auto;" /></p>
<p>The scores can be mapped to show the spatial aspects:</p>
<div class="sourceCode" id="cb22"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb22-1"><a href="#cb22-1" tabindex="-1"></a><span class="fu">s.value</span>(xy, bet<span class="sc">$</span>ls, <span class="at">symbol =</span> <span class="st">&quot;circle&quot;</span>, <span class="at">Sp =</span> france.map, <span class="at">pSp.col =</span> col.region[region.names], <span class="at">pSp.border =</span> <span class="st">&quot;transparent&quot;</span>)</span></code></pre></div>
<p><img src="" style="display: block; margin: auto;" /></p>
<p>The results are very close to those obtained by PCA: the first axis
contrasts the North and the East (<em>la France éclairée</em>) with the
other regions, while the South is separated from the other regions by the
second axis. The high variability of the region Centre is also
noticeable. In contrast, the South is very homogeneous.</p>
</div>
<div id="spatial-explanatory-variables" class="section level2">
<h2>Spatial explanatory variables</h2>
<p>Principal component analysis with respect to instrumental
variables <span class="citation">(PCAIV, <a href="#ref-SD540">Rao
1964</a>)</span>, and related methods, have often been used in community
ecology to identify spatial relationships. The spatial information is
introduced in the form of spatial predictors, and the analysis maximizes
the “spatial variance” (i.e., the variance explained by the spatial
predictors). Note that BCA can also be considered a particular case
of PCAIV in which the explanatory variables are dummy variables indicating
group membership.</p>
<div id="trend-surface-of-geographic-coordinates" class="section level3">
<h3>Trend surface of geographic coordinates</h3>
<p><span class="citation">Student (<a href="#ref-SD626">1914</a>)</span>
proposed to express observed values in time series as a polynomial
function of time, and mentioned that this could be done for spatial data
as well. <span class="citation">Borcard, Legendre, and Drapeau (<a href="#ref-SD59">1992</a>)</span> extended this approach to the spatial
and multivariate case by introducing polynomial functions of geographic
coordinates as predictors in PCAIV. We call this approach
PCAIV-POLY.</p>
<p>The centroids of départements of France were used to construct a
second-degree orthogonal polynomial.</p>
<div class="sourceCode" id="cb23"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb23-1"><a href="#cb23-1" tabindex="-1"></a>poly.xy <span class="ot">&lt;-</span> <span class="fu">orthobasis.poly</span>(xy, <span class="at">degree =</span> <span class="dv">2</span>)</span>
<span id="cb23-2"><a href="#cb23-2" tabindex="-1"></a><span class="fu">s.value</span>(xy, poly.xy, <span class="at">Sp =</span> france.map, <span class="at">plegend.drawKey =</span> <span class="cn">FALSE</span>)</span></code></pre></div>
<p><img src="" style="display: block; margin: auto;" /></p>
<p>PCAIV is then performed using the <code>pcaiv</code> function:</p>
<div class="sourceCode" id="cb24"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb24-1"><a href="#cb24-1" tabindex="-1"></a>pcaiv.xy <span class="ot">&lt;-</span> <span class="fu">pcaiv</span>(pca, poly.xy, <span class="at">scannf =</span> <span class="cn">FALSE</span>, <span class="at">nf =</span> <span class="dv">2</span>)</span></code></pre></div>
<p>Here, 32.4 % of the total variance (sum of eigenvalues of PCA) is
explained by the second-degree polynomial (sum of eigenvalues of PCAIV).
The first two dimensions account for 51.4 % and 35.2 %, respectively, of
the explained variance.</p>
<div class="sourceCode" id="cb25"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb25-1"><a href="#cb25-1" tabindex="-1"></a><span class="fu">sum</span>(pcaiv.xy<span class="sc">$</span>eig)<span class="sc">/</span><span class="fu">sum</span>(pca<span class="sc">$</span>eig) <span class="sc">*</span> <span class="dv">100</span></span>
<span id="cb25-2"><a href="#cb25-2" tabindex="-1"></a><span class="co">#&gt; [1] 32.36025</span></span>
<span id="cb25-3"><a href="#cb25-3" tabindex="-1"></a>pcaiv.xy<span class="sc">$</span>eig<span class="sc">/</span><span class="fu">sum</span>(pcaiv.xy<span class="sc">$</span>eig) <span class="sc">*</span> <span class="dv">100</span></span>
<span id="cb25-4"><a href="#cb25-4" tabindex="-1"></a><span class="co">#&gt; [1] 51.423264 35.152362  5.954475  4.799075  2.670825</span></span></code></pre></div>
<p>The outputs of PCAIV-POLY (coefficients of variables, maps of
départements scores, etc.) are very similar to those obtained by BCA.
They can be represented easily by the generic <code>plot</code>
function:</p>
<div class="sourceCode" id="cb26"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb26-1"><a href="#cb26-1" tabindex="-1"></a><span class="fu">plot</span>(pcaiv.xy)</span></code></pre></div>
<p><img src="" style="display: block; margin: auto;" /></p>
</div>
<div id="morans-eigenvector-maps" class="section level3">
<h3>Moran’s eigenvector maps</h3>
<p>An alternative way to build spatial predictors is by the
diagonalization of the spatial weighting matrix <strong>W</strong>.
Moran’s eigenvector maps <span class="citation">(MEM, <a href="#ref-SD163">Dray, Legendre, and Peres-Neto 2006</a>)</span> are
the <span class="math inline">\(n-1\)</span> eigenvectors of the
doubly-centered matrix <strong>W</strong>. They are orthogonal vectors
with a unit norm maximizing MC <span class="citation">(<a href="#ref-SD264">Griffith 1996</a>)</span>. MEM associated with high
positive (or negative) eigenvalues have high positive (or negative)
autocorrelation. MEM associated with eigenvalues with small absolute
values correspond to low spatial autocorrelation, and are not suitable
for defining spatial structures.</p>
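<p>The construction can be illustrated with a self-contained sketch (a hypothetical 4-site path graph, not the Guerry neighbourhood graph): MEM are the eigenvectors of the doubly-centred matrix <strong>HWH</strong>, with centring operator <strong>H</strong> = <strong>I</strong> - <strong>11</strong>&#8217;/n, and the MC of each non-trivial MEM is proportional to its eigenvalue.</p>

```r
# Hypothetical 4-site path graph (not the Guerry graph): MEM by eigendecomposition.
W <- matrix(c(0, 1, 0, 0,
              1, 0, 1, 0,
              0, 1, 0, 1,
              0, 0, 1, 0), 4, 4)
n <- nrow(W)
H <- diag(n) - matrix(1, n, n) / n           # centring operator I - 11'/n
e <- eigen(H %*% W %*% H, symmetric = TRUE)  # MEM and their eigenvalues
u <- e$vectors[, 1]                          # MEM with the largest eigenvalue
mc <- (n / sum(W)) * sum(u * (W %*% u)) / sum(u^2)   # MC of this MEM
all.equal(mc, (n / sum(W)) * e$values[1])    # TRUE: MC proportional to eigenvalue
```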
<p>We used the spatial weighting matrix defined above to construct MEM.
The first nine MEM, corresponding to the highest levels of spatial
autocorrelation, are mapped:</p>
<div class="sourceCode" id="cb27"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb27-1"><a href="#cb27-1" tabindex="-1"></a>mem1 <span class="ot">&lt;-</span> <span class="fu">scores.listw</span>(lw)</span>
<span id="cb27-2"><a href="#cb27-2" tabindex="-1"></a><span class="fu">s.value</span>(xy, mem1[, <span class="dv">1</span><span class="sc">:</span><span class="dv">9</span>], <span class="at">Sp =</span> france.map, <span class="at">plegend.drawKey =</span> <span class="cn">FALSE</span>)</span></code></pre></div>
<p><img src="" style="display: block; margin: auto;" /></p>
<p>We introduced the first ten MEM as spatial explanatory variables in
PCAIV. We call this approach PCAIV-MEM.</p>
<div class="sourceCode" id="cb28"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb28-1"><a href="#cb28-1" tabindex="-1"></a>pcaiv.mem <span class="ot">&lt;-</span> <span class="fu">pcaiv</span>(pca, mem1[,<span class="dv">1</span><span class="sc">:</span><span class="dv">10</span>], <span class="at">scannf =</span> <span class="cn">FALSE</span>)</span></code></pre></div>
<p>Here, 44.1 % of the total variance (sum of eigenvalues of PCA) is
explained by the first ten MEM (sum of eigenvalues of PCAIV). The first
two dimensions account for 54.9 % and 26.3 %, respectively, of the
explained variance.</p>
<div class="sourceCode" id="cb29"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb29-1"><a href="#cb29-1" tabindex="-1"></a><span class="fu">sum</span>(pcaiv.mem<span class="sc">$</span>eig)<span class="sc">/</span><span class="fu">sum</span>(pca<span class="sc">$</span>eig) <span class="sc">*</span> <span class="dv">100</span></span>
<span id="cb29-2"><a href="#cb29-2" tabindex="-1"></a><span class="co">#&gt; [1] 44.11597</span></span>
<span id="cb29-3"><a href="#cb29-3" tabindex="-1"></a>pcaiv.mem<span class="sc">$</span>eig<span class="sc">/</span><span class="fu">sum</span>(pcaiv.mem<span class="sc">$</span>eig) <span class="sc">*</span> <span class="dv">100</span></span>
<span id="cb29-4"><a href="#cb29-4" tabindex="-1"></a><span class="co">#&gt; [1] 54.928640 26.300082  9.042300  4.574408  3.532962  1.621607</span></span></code></pre></div>
<p>The outputs of PCAIV-MEM (coefficients of variables, maps of
départements scores, etc.) are very similar to those obtained by BCA.
They can be represented easily by the generic <code>plot</code>
function:</p>
<div class="sourceCode" id="cb30"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb30-1"><a href="#cb30-1" tabindex="-1"></a><span class="fu">plot</span>(pcaiv.mem)</span></code></pre></div>
<p><img src="" style="display: block; margin: auto;" /></p>
</div>
</div>
<div id="spatial-graph-and-weighting-matrix" class="section level2">
<h2>Spatial graph and weighting matrix</h2>
<p>The MEM framework introduces the spatial information into
multivariate analysis through the eigendecomposition of the spatial
weighting matrix. Usually, only part of the information
contained in this matrix is considered, because only a subset of MEM are used as
regressors in PCAIV. In this section, we focus on multivariate methods
that consider the spatial weighting matrix in its original form.</p>
<p><span class="citation">Wartenberg (<a href="#ref-SD694">1985</a>)</span> was the first to develop a
multivariate analysis based on MC. His work considered only normed and
centered variables (i.e., normed PCA) for the multivariate part and a
binary symmetric connectivity matrix for the spatial aspect. <span class="citation">Dray, Saïd, and Débias (<a href="#ref-SD807">2008</a>)</span> generalized Wartenberg’s method by
introducing a row-standardized spatial weighting matrix in the analysis
of a statistical triplet. This approach is very general and allows us to
define spatially-constrained versions of various methods (corresponding
to different triplets), such as correspondence analysis or multiple
correspondence analysis. The resulting method, MULTISPATI, finds coefficients
for a linear combination of variables that maximizes a compromise between the
classical multivariate analysis and a generalized version of Moran’s
coefficient.</p>
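<p>This compromise can be made concrete with a hedged sketch (toy data and a hypothetical row-standardized ring graph, not the objects built earlier): for a normed PCA triplet, the MULTISPATI axes are eigenvectors of X&#8217;(W + W&#8217;)X / (2n), and each eigenvalue equals the variance of the corresponding score multiplied by its Moran’s coefficient.</p>

```r
# Toy illustration (hypothetical ring of 6 sites, not Guerry's data).
set.seed(1)
n <- 6
W0 <- matrix(0, n, n)
for (i in 1:n) W0[i, c(i %% n + 1, (i - 2) %% n + 1)] <- 1  # ring adjacency
W <- W0 / rowSums(W0)                                       # row-standardize
X <- scale(matrix(rnorm(n * 3), n, 3))                      # centred, scaled data
e <- eigen(crossprod(X, (W + t(W)) %*% X) / (2 * n), symmetric = TRUE)
u <- as.vector(X %*% e$vectors[, 1])       # first MULTISPATI score
v <- sum(u^2) / n                          # variance of the score (weights 1/n)
mc <- sum(u * (W %*% u)) / sum(u^2)        # MC of the score (row-standardized W)
all.equal(e$values[1], v * mc)             # TRUE: eigenvalue = variance times MC
```

This is the same variance-times-autocorrelation decomposition reported by <code>summary(ms)</code> below.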
<div class="sourceCode" id="cb31"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb31-1"><a href="#cb31-1" tabindex="-1"></a> ms <span class="ot">&lt;-</span> <span class="fu">multispati</span>(pca, lw, <span class="at">scannf =</span> <span class="cn">FALSE</span>)</span></code></pre></div>
<p>The main outputs of MULTISPATI can be represented easily by the
generic <code>plot</code> function:</p>
<div class="sourceCode" id="cb32"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb32-1"><a href="#cb32-1" tabindex="-1"></a><span class="fu">plot</span>(ms)</span></code></pre></div>
<p><img src="" style="display: block; margin: auto;" /></p>
<p>The barplot of eigenvalues suggests two main spatial structures.
Eigenvalues of MULTISPATI are the product of the variance and the
spatial autocorrelation of the scores, whereas PCA maximizes only the
variance. The differences between the two methods can be quantified with the
<code>summary</code> function:</p>
<div class="sourceCode" id="cb33"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb33-1"><a href="#cb33-1" tabindex="-1"></a><span class="fu">summary</span>(ms)</span>
<span id="cb33-2"><a href="#cb33-2" tabindex="-1"></a><span class="co">#&gt; </span></span>
<span id="cb33-3"><a href="#cb33-3" tabindex="-1"></a><span class="co">#&gt; Multivariate Spatial Analysis</span></span>
<span id="cb33-4"><a href="#cb33-4" tabindex="-1"></a><span class="co">#&gt; Call: multispati(dudi = pca, listw = lw, scannf = FALSE)</span></span>
<span id="cb33-5"><a href="#cb33-5" tabindex="-1"></a><span class="co">#&gt; </span></span>
<span id="cb33-6"><a href="#cb33-6" tabindex="-1"></a><span class="co">#&gt; Scores from the initial duality diagram:</span></span>
<span id="cb33-7"><a href="#cb33-7" tabindex="-1"></a><span class="co">#&gt;          var      cum     ratio     moran</span></span>
<span id="cb33-8"><a href="#cb33-8" tabindex="-1"></a><span class="co">#&gt; RS1 2.140470 2.140470 0.3567451 0.5506343</span></span>
<span id="cb33-9"><a href="#cb33-9" tabindex="-1"></a><span class="co">#&gt; RS2 1.200820 3.341290 0.5568817 0.5614030</span></span>
<span id="cb33-10"><a href="#cb33-10" tabindex="-1"></a><span class="co">#&gt; RS3 1.102047 4.443337 0.7405562 0.1805569</span></span>
<span id="cb33-11"><a href="#cb33-11" tabindex="-1"></a><span class="co">#&gt; </span></span>
<span id="cb33-12"><a href="#cb33-12" tabindex="-1"></a><span class="co">#&gt; Multispati eigenvalues decomposition:</span></span>
<span id="cb33-13"><a href="#cb33-13" tabindex="-1"></a><span class="co">#&gt;           eig      var     moran</span></span>
<span id="cb33-14"><a href="#cb33-14" tabindex="-1"></a><span class="co">#&gt; CS1 1.2858683 2.017171 0.6374612</span></span>
<span id="cb33-15"><a href="#cb33-15" tabindex="-1"></a><span class="co">#&gt; CS2 0.6939887 1.176551 0.5898499</span></span></code></pre></div>
<p>Hence, there is a loss of variance compared to PCA (2.14 versus 2.017
for axis 1; 1.201 versus 1.177 for axis 2) but a gain of spatial
autocorrelation (0.551 versus 0.637 for axis 1; 0.561 versus 0.59 for
axis 2).</p>
<p>The coefficients of the variables allow the structures to be interpreted:</p>
<div class="sourceCode" id="cb34"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb34-1"><a href="#cb34-1" tabindex="-1"></a><span class="fu">s.arrow</span>(ms<span class="sc">$</span>c1, <span class="at">plabel.cex =</span> <span class="fl">0.8</span>)</span></code></pre></div>
<p><img src="" style="display: block; margin: auto;" /></p>
<p>The first axis opposes literacy to property crime, suicides and
illegitimate births. The second axis is aligned mainly with personal
crime and donations to the poor. The maps of the scores show that the
spatial structures are very close to those identified by PCA. The
similarity of the results of PCA and its spatially optimized version
confirms that the main structures of Guerry’s data are spatial.</p>
<p>Spatial autocorrelation can be seen as the link between one variable
and the lagged vector. This interpretation is used to construct the
Moran scatterplot and can be extended to the multivariate case in
MULTISPATI by analyzing the link between scores and lagged scores:</p>
<div class="sourceCode" id="cb35"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb35-1"><a href="#cb35-1" tabindex="-1"></a><span class="fu">s.match</span>(ms<span class="sc">$</span>li, ms<span class="sc">$</span>ls, <span class="at">plabel.cex =</span> <span class="dv">0</span>)</span>
<span id="cb35-2"><a href="#cb35-2" tabindex="-1"></a><span class="fu">s.match</span>(ms<span class="sc">$</span>li[<span class="fu">c</span>(<span class="dv">10</span>, <span class="dv">41</span>, <span class="dv">27</span>), ], ms<span class="sc">$</span>ls[<span class="fu">c</span>(<span class="dv">10</span>, <span class="dv">41</span>, <span class="dv">27</span>), ], <span class="at">label =</span> dep.names[<span class="fu">c</span>(<span class="dv">10</span>, </span>
<span id="cb35-3"><a href="#cb35-3" tabindex="-1"></a>     <span class="dv">41</span>, <span class="dv">27</span>)], <span class="at">plabel.cex =</span> <span class="fl">0.8</span>, <span class="at">add =</span> <span class="cn">TRUE</span>)</span></code></pre></div>
<p><img src="" style="display: block; margin: auto;" /></p>
<p>Each département can be represented on the factorial map by an arrow
(the tail corresponds to its score, the head to its lagged
score). A short arrow reveals a local spatial similarity (between one
plot and its neighbors) while a long arrow reveals a spatial
discrepancy. This viewpoint can be interpreted as a multivariate
extension of the local index of spatial association <span class="citation">(<a href="#ref-SD565">Anselin 1995</a>)</span>. For
instance:</p>
<ul>
<li>Aude has a very small arrow, indicating that this département is very
similar to its neighbors.</li>
<li>Haute-Loire has a long horizontal arrow, reflecting its high values for
the variables Infants (31017), Suicides (163241) and Crime_prop (18043)
compared to the average values over its neighbors (27032.4, 60097.8 and
10540.8, respectively).</li>
<li>Finistère has an arrow with a long vertical component, due to its high
values compared to its neighbors for Donations (23945 versus 12563) and
Crime_pers (29872 versus 25962).</li>
</ul>
<p>The link between the scores and the lagged scores (averages of
neighbors weighted by the spatial connection matrix) can be mapped in
the geographical space. For the first two axes, we have:</p>
<div class="sourceCode" id="cb36"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb36-1"><a href="#cb36-1" tabindex="-1"></a><span class="fu">s.value</span>(xy, ms<span class="sc">$</span>li, <span class="at">Sp =</span> france.map)</span></code></pre></div>
<p><img src="" style="display: block; margin: auto;" /></p>
</div>
</div>
<div id="conclusions" class="section level1">
<h1>Conclusions</h1>
<p>Even though the methods presented are quite different in their
theoretical and practical viewpoints, their application to Guerry’s
dataset yields very similar results. We provided a quantitative measure
of this similarity by computing Procrustes statistics <span class="citation">(<a href="#ref-SD516">Peres-Neto and Jackson
2001</a>)</span> between the scores of the
départements on the first two axes of the different analyses. All the
values of the statistic are very high and significant; this confirms the
high concordance between the outputs of the different methods.</p>
<div class="sourceCode" id="cb37"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb37-1"><a href="#cb37-1" tabindex="-1"></a>mat <span class="ot">&lt;-</span> <span class="fu">matrix</span>(<span class="cn">NA</span>, <span class="dv">4</span>, <span class="dv">4</span>)</span>
<span id="cb37-2"><a href="#cb37-2" tabindex="-1"></a>mat.names <span class="ot">&lt;-</span> <span class="fu">c</span>(<span class="st">&quot;PCA&quot;</span>, <span class="st">&quot;BCA&quot;</span>, <span class="st">&quot;PCAIV-POLY&quot;</span>, <span class="st">&quot;PCAIV-MEM&quot;</span>, <span class="st">&quot;MULTISPATI&quot;</span>)</span>
<span id="cb37-3"><a href="#cb37-3" tabindex="-1"></a><span class="fu">colnames</span>(mat) <span class="ot">&lt;-</span> mat.names[<span class="sc">-</span><span class="dv">5</span>]</span>
<span id="cb37-4"><a href="#cb37-4" tabindex="-1"></a><span class="fu">rownames</span>(mat) <span class="ot">&lt;-</span> mat.names[<span class="sc">-</span><span class="dv">1</span>]</span>
<span id="cb37-5"><a href="#cb37-5" tabindex="-1"></a></span>
<span id="cb37-6"><a href="#cb37-6" tabindex="-1"></a>mat[<span class="dv">1</span>, <span class="dv">1</span>] <span class="ot">&lt;-</span> <span class="fu">procuste.randtest</span>(pca<span class="sc">$</span>li[, <span class="dv">1</span><span class="sc">:</span><span class="dv">2</span>], bet<span class="sc">$</span>ls[, <span class="dv">1</span><span class="sc">:</span><span class="dv">2</span>])<span class="sc">$</span>obs</span>
<span id="cb37-7"><a href="#cb37-7" tabindex="-1"></a>mat[<span class="dv">2</span>, <span class="dv">1</span>] <span class="ot">&lt;-</span> <span class="fu">procuste.randtest</span>(pca<span class="sc">$</span>li[, <span class="dv">1</span><span class="sc">:</span><span class="dv">2</span>], pcaiv.xy<span class="sc">$</span>ls[, <span class="dv">1</span><span class="sc">:</span><span class="dv">2</span>])<span class="sc">$</span>obs</span>
<span id="cb37-8"><a href="#cb37-8" tabindex="-1"></a>mat[<span class="dv">3</span>, <span class="dv">1</span>] <span class="ot">&lt;-</span> <span class="fu">procuste.randtest</span>(pca<span class="sc">$</span>li[, <span class="dv">1</span><span class="sc">:</span><span class="dv">2</span>], pcaiv.mem<span class="sc">$</span>ls[, <span class="dv">1</span><span class="sc">:</span><span class="dv">2</span>])<span class="sc">$</span>obs</span>
<span id="cb37-9"><a href="#cb37-9" tabindex="-1"></a>mat[<span class="dv">4</span>, <span class="dv">1</span>] <span class="ot">&lt;-</span> <span class="fu">procuste.randtest</span>(pca<span class="sc">$</span>li[, <span class="dv">1</span><span class="sc">:</span><span class="dv">2</span>], ms<span class="sc">$</span>li[, <span class="dv">1</span><span class="sc">:</span><span class="dv">2</span>])<span class="sc">$</span>obs</span>
<span id="cb37-10"><a href="#cb37-10" tabindex="-1"></a>mat[<span class="dv">2</span>, <span class="dv">2</span>] <span class="ot">&lt;-</span> <span class="fu">procuste.randtest</span>(bet<span class="sc">$</span>ls[, <span class="dv">1</span><span class="sc">:</span><span class="dv">2</span>], pcaiv.xy<span class="sc">$</span>ls[, <span class="dv">1</span><span class="sc">:</span><span class="dv">2</span>])<span class="sc">$</span>obs</span>
<span id="cb37-11"><a href="#cb37-11" tabindex="-1"></a>mat[<span class="dv">3</span>, <span class="dv">2</span>] <span class="ot">&lt;-</span> <span class="fu">procuste.randtest</span>(bet<span class="sc">$</span>ls[, <span class="dv">1</span><span class="sc">:</span><span class="dv">2</span>], pcaiv.mem<span class="sc">$</span>ls[, <span class="dv">1</span><span class="sc">:</span><span class="dv">2</span>])<span class="sc">$</span>obs</span>
<span id="cb37-12"><a href="#cb37-12" tabindex="-1"></a>mat[<span class="dv">4</span>, <span class="dv">2</span>] <span class="ot">&lt;-</span> <span class="fu">procuste.randtest</span>(bet<span class="sc">$</span>ls[, <span class="dv">1</span><span class="sc">:</span><span class="dv">2</span>], ms<span class="sc">$</span>li[, <span class="dv">1</span><span class="sc">:</span><span class="dv">2</span>])<span class="sc">$</span>obs</span>
<span id="cb37-13"><a href="#cb37-13" tabindex="-1"></a>mat[<span class="dv">3</span>, <span class="dv">3</span>] <span class="ot">&lt;-</span> <span class="fu">procuste.randtest</span>(pcaiv.xy<span class="sc">$</span>ls[, <span class="dv">1</span><span class="sc">:</span><span class="dv">2</span>], pcaiv.mem<span class="sc">$</span>ls[, <span class="dv">1</span><span class="sc">:</span><span class="dv">2</span>])<span class="sc">$</span>obs</span>
<span id="cb37-14"><a href="#cb37-14" tabindex="-1"></a>mat[<span class="dv">4</span>, <span class="dv">3</span>] <span class="ot">&lt;-</span> <span class="fu">procuste.randtest</span>(pcaiv.xy<span class="sc">$</span>ls[, <span class="dv">1</span><span class="sc">:</span><span class="dv">2</span>], ms<span class="sc">$</span>li[, <span class="dv">1</span><span class="sc">:</span><span class="dv">2</span>])<span class="sc">$</span>obs</span>
<span id="cb37-15"><a href="#cb37-15" tabindex="-1"></a>mat[<span class="dv">4</span>, <span class="dv">4</span>] <span class="ot">&lt;-</span> <span class="fu">procuste.randtest</span>(pcaiv.mem<span class="sc">$</span>ls[, <span class="dv">1</span><span class="sc">:</span><span class="dv">2</span>], ms<span class="sc">$</span>li[, <span class="dv">1</span><span class="sc">:</span><span class="dv">2</span>])<span class="sc">$</span>obs</span>
<span id="cb37-16"><a href="#cb37-16" tabindex="-1"></a></span>
<span id="cb37-17"><a href="#cb37-17" tabindex="-1"></a>mat</span>
<span id="cb37-18"><a href="#cb37-18" tabindex="-1"></a><span class="co">#&gt;                  PCA       BCA PCAIV-POLY PCAIV-MEM</span></span>
<span id="cb37-19"><a href="#cb37-19" tabindex="-1"></a><span class="co">#&gt; BCA        0.9788594        NA         NA        NA</span></span>
<span id="cb37-20"><a href="#cb37-20" tabindex="-1"></a><span class="co">#&gt; PCAIV-POLY 0.9791939 0.9897463         NA        NA</span></span>
<span id="cb37-21"><a href="#cb37-21" tabindex="-1"></a><span class="co">#&gt; PCAIV-MEM  0.9885944 0.9936217  0.9954116        NA</span></span>
<span id="cb37-22"><a href="#cb37-22" tabindex="-1"></a><span class="co">#&gt; MULTISPATI 0.9868799 0.9954034  0.9951133 0.9986471</span></span></code></pre></div>
</div>
<div id="references" class="section level1 unnumbered">
<h1 class="unnumbered">References</h1>
<div id="refs" class="references csl-bib-body hanging-indent">
<div id="ref-SD565" class="csl-entry">
Anselin, L. 1995. <span>“Local Indicators of Spatial
Association.”</span> <em>Geographical Analysis</em> 27: 93–115.
</div>
<div id="ref-SD566" class="csl-entry">
———. 1996. <span>“The <span>M</span>oran Scatterplot as an
<span>ESDA</span> Tool to Assess Local Instability in Spatial
Association.”</span> In <em>Spatial Analytical Perspectives on GIS</em>,
edited by M. M. Fischer, H. J. Scholten, and D. Unwin, 111–25. London:
Taylor &amp; Francis.
</div>
<div id="ref-SD59" class="csl-entry">
Borcard, D., P. Legendre, and P. Drapeau. 1992. <span>“Partialling Out
the Spatial Component of Ecological Variation.”</span> <em>Ecology</em>
73: 1045–55.
</div>
<div id="ref-SD577" class="csl-entry">
Cliff, A. D., and J. K. Ord. 1973. <em>Spatial Autocorrelation</em>.
London: Pion.
</div>
<div id="ref-SD148" class="csl-entry">
Dolédec, S., and D. Chessel. 1987. <span>“Rythmes saisonniers et
composantes stationnelles en milieu aquatique. I. Description d’un plan
d’observations complet par projection de variables.”</span> <em>Acta
Oecologica - Oecologia Generalis</em> 8
(3): 403–26.
</div>
<div id="ref-SD966" class="csl-entry">
Dray, S., and T. Jombart. 2011. <span>“Revisiting Guerry’s Data:
Introducing Spatial Constraints in Multivariate Analysis.”</span>
<em>Annals of Applied Statistics</em> 5 (4): 2278–99.
</div>
<div id="ref-SD163" class="csl-entry">
Dray, S., P. Legendre, and P. R. Peres-Neto. 2006. <span>“Spatial
Modeling: A Comprehensive Framework for Principal Coordinate Analysis of
Neighbor Matrices (<span>PCNM</span>).”</span> <em>Ecological
Modelling</em> 196: 483–93.
</div>
<div id="ref-SD807" class="csl-entry">
Dray, S., S. Saïd, and F. Débias. 2008. <span>“Spatial Ordination of
Vegetation Data Using a Generalization of <span>W</span>artenberg’s
Multivariate Spatial Correlation.”</span> <em>Journal of Vegetation
Science</em> 19: 45–56.
</div>
<div id="ref-SD223" class="csl-entry">
Geary, R. C. 1954. <span>“The Contiguity Ratio and Statistical
Mapping.”</span> <em>The Incorporated Statistician</em> 5 (3): 115–45.
</div>
<div id="ref-SD264" class="csl-entry">
Griffith, D. A. 1996. <span>“Spatial Autocorrelation and Eigenfunctions
of the Geographic Weights Matrix Accompanying Geo-Referenced
Data.”</span> <em>Canadian Geographer</em> 40 (4): 351–67.
</div>
<div id="ref-SD955" class="csl-entry">
Guerry, A. M. 1833. <em>Essai sur la statistique morale de la
France</em>. Paris: Crochard.
</div>
<div id="ref-SD308" class="csl-entry">
Hotelling, H. 1933. <span>“Analysis of a Complex of Statistical
Variables into Principal Components.”</span> <em>Journal of Educational
Psychology</em> 24: 417–41.
</div>
<div id="ref-SD455" class="csl-entry">
Moran, P. A. P. 1948. <span>“The Interpretation of Statistical
Maps.”</span> <em>Journal of the Royal Statistical Society, Series B
(Methodological)</em> 10: 243–51.
</div>
<div id="ref-SD516" class="csl-entry">
Peres-Neto, P. R., and D. A. Jackson. 2001. <span>“How Well Do
Multivariate Data Sets Match? <span>T</span>he Advantages of a
<span>P</span>rocrustean Superimposition Approach over the
<span>M</span>antel Test.”</span> <em>Oecologia</em> 129: 169–78.
</div>
<div id="ref-SD540" class="csl-entry">
Rao, C. R. 1964. <span>“The Use and Interpretation of Principal
Component Analysis in Applied Research.”</span> <em>Sankhya A</em> 26:
329–59.
</div>
<div id="ref-SD626" class="csl-entry">
Student, W. S. 1914. <span>“The Elimination of Spurious Correlation Due
to Position in Time or Space.”</span> <em>Biometrika</em> 10: 179–80.
</div>
<div id="ref-SD694" class="csl-entry">
Wartenberg, D. 1985. <span>“Multivariate Spatial Correlation: A Method
for Exploratory Geographical Analysis.”</span> <em>Geographical
Analysis</em> 17 (4): 263–83.
</div>
</div>
</div>



<!-- code folding -->


<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
  (function () {
    var script = document.createElement("script");
    script.type = "text/javascript";
    script.src  = "https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
    document.getElementsByTagName("head")[0].appendChild(script);
  })();
</script>

</body>
</html>
